Skip to content

improv(tests): adopt new aws cdk cli library #1624

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from

Conversation

dreamorosi
Copy link
Contributor

Description of your changes

This PR introduces a new package to the workspace dedicated to testing utilities. Historically the commons package has been a catch-all package for all shared utilities and code that didn't belong to any specific package. This has caused the package to sprawl (#1354) and while this hasn't always had an impact on what gets published to npm in the commons utility, the repository could use some reorganization.

In the case of integration/e2e tests specifically, there's a lot of room for improvement (#1314) since one of the most critical pieces of our testing infrastructure currently relies on an internal API from CDK that could (and has) change(d) without notice.

As a part of the new internal & unpublished testing utility, the PR introduces a new class called TestStack that relies on the @aws-cdk/cli-lib-alpha module, which allows to interact with the AWS CDK CLI programmatically. This class incorporates many of the lessons learned and aims at streamlining the boilerplate code needed to setup the infrastructure for a test, while standardizing the resources deployed as part of it.

Technical deep dive

To articulate the benefits of the new TestStack class, let's try to partially refactor the makeHandlerIdempotent tests. This test suite is fairly recent and presents many of the issues and sprawl that the new class aims to solve.

At a high level, our integration tests can be divided (like many tests of this type) in three discrete areas:

  • Setup
  • Testing (aka the test cases)
  • Teardown

While this PR focuses on the first and last phases, and specifically on the "Setup" one, the new class could address some of the optimizations in the "Testing" phase in future PRs (see "Future improvements" section below).

In order to setup the architecture needed for the tests, this specific one as well as all others, all have a version of the code below:

Note
Comments were added for clarity, but are not present in the source

// Validate that the specified runtime is valid (present in all tests)
const runtime: string = process.env.RUNTIME || 'nodejs18x';

if (!isValidRuntimeKey(runtime)) {
  throw new Error(`Invalid runtime key value: ${runtime}`);
}

// Generate an unique name for the test stack (this allows to run multiple e2e tests at the same time)
const uuid = v4();
const stackName = generateUniqueName(
  RESOURCE_NAME_PREFIX,
  uuid,
  runtime,
  'makeFnIdempotent'
);

// Specify the filename for the Lambda that will be deploy to run the test (this is unique to each test suite)
const makeHandlerIdempotentFile = 'makeHandlerIdempotent.test.FunctionCode.ts';

// Initialise resources to deploy & interact with the stack
const app = new App();
const ddb = new DynamoDBClient({});
const stack = new Stack(app, stackName);

/**
 * Add resources to the stack, in this case we are:
 * - specifying an unique name for the test case
 * - generating an unique name for the function (using the name from above)
 * - generating an unique name for the Dynamo DB Table (using the name from above)
 * - adding the CDK resources to the Stack (other file)
 *
 * NOTE: this is only for one test case
 */
const testDefault = 'default-sequential';
const functionNameDefault = generateUniqueName(
  RESOURCE_NAME_PREFIX,
  uuid,
  runtime,
  `${testDefault}-fn`
);
const ddbTableNameDefault = generateUniqueName(
  RESOURCE_NAME_PREFIX,
  uuid,
  runtime,
  `${testDefault}-table`
);
createIdempotencyResources(
  stack,
  runtime,
  ddbTableNameDefault,
  makeHandlerIdempotentFile,
  functionNameDefault,
  'handler'
);

// Doing the same as above, but for another test case (minimal differences)
const testDefaultParallel = 'default-parallel';
const functionNameDefaultParallel = generateUniqueName(
  RESOURCE_NAME_PREFIX,
  uuid,
  runtime,
  `${testDefaultParallel}-fn`
);
const ddbTableNameDefaultParallel = generateUniqueName(
  RESOURCE_NAME_PREFIX,
  uuid,
  runtime,
  `${testDefaultParallel}-table`
);
createIdempotencyResources(
  stack,
  runtime,
  ddbTableNameDefaultParallel,
  makeHandlerIdempotentFile,
  functionNameDefaultParallel,
  'handlerParallel'
);

// And again
const testTimeout = 'timeout';
const functionNameTimeout = generateUniqueName(
  RESOURCE_NAME_PREFIX,
  uuid,
  runtime,
  `${testTimeout}-fn`
);
const ddbTableNameTimeout = generateUniqueName(
  RESOURCE_NAME_PREFIX,
  uuid,
  runtime,
  `${testTimeout}-table`
);
createIdempotencyResources(
  stack,
  runtime,
  ddbTableNameTimeout,
  makeHandlerIdempotentFile,
  functionNameTimeout,
  'handlerTimeout',
  undefined,
  2
);

// And again..
const testExpired = 'expired';
const functionNameExpired = generateUniqueName(
  RESOURCE_NAME_PREFIX,
  uuid,
  runtime,
  `${testExpired}-fn`
);
const ddbTableNameExpired = generateUniqueName(
  RESOURCE_NAME_PREFIX,
  uuid,
  runtime,
  `${testExpired}-table`
);
createIdempotencyResources(
  stack,
  runtime,
  ddbTableNameExpired,
  makeHandlerIdempotentFile,
  functionNameExpired,
  'handlerExpired',
  undefined,
  2
);

As you can see, the code above is fairly verbose and contains a lot of boilerplate. Additionally, there's another createIdempotencyResources function referenced. This function is in charge of adding a NodejsFunction and a Table CDK constructs to the stack for each test and most of the utilities (Tracer, Logger, Metrics, Idempotency, etc.) define a version of this function to deploy resources.

Once all the resources have been added to the CDK stack, in the beforeAll phase of the test we actually deploy the stack:

beforeAll(async () => {
    await deployStack(app, stack);
  }, SETUP_TIMEOUT);

Now, let's refactor the code above using the new TestStack class that is introduced in this PR:

Note
Once again, comments are added for clarity

// Import the new class
import { TestStack } from '../../../testing';

// Initialize the the AWS SDK client(s) to interact with the deployed stack
const ddb = new DynamoDBClient({});

// Initialise the stack, passing a common resource prefix and test name,
// these will be used to generate unique names for each new resource
const stack = new TestStack(RESOURCE_NAME_PREFIX, 'makeHandlerIdempotent');
// Create the full path of the file that contains the Lambda fn (this has to be done here because it's relative to the test file)
const functionFileName = join(
  __dirname,
  'makeHandlerIdempotent.test.FunctionCode.ts'
);

beforeAll(async () => {
  /**
   * For each new case, define:
   * - a name for the test
   * - a function (that can be fully customised - see types in the PR diff)
   * - other resources (in this case a DynamoDB table)
  */
  stack.addTestCase({
    testCaseName: "default",
    function: {
      functionCode: {
        path: functionFileName,
        handler: "handler",
      },
    },
    dynamodb: {
      envVariableName: "IDEMPOTENCY_TABLE_NAME",
    },
  });

  // Do this of each test case
  stack.addTestCase({
    testCaseName: "default-parallel",
    function: {
      functionCode: {
        path: functionFileName,
        handler: "handlerParallel",
      },
    },
    dynamodb: {
      envVariableName: "IDEMPOTENCY_TABLE_NAME",
    },
  });

  stack.addTestCase({
    testCaseName: "timeout",
    function: {
      functionCode: {
        path: functionFileName,
        handler: "handlerTimeout",
      },
      functionConfigs: {
        timeout: 2,
      },
    },
    dynamodb: {
      envVariableName: "IDEMPOTENCY_TABLE_NAME",
    },
  });

  stack.addTestCase({
    testCaseName: "expired",
    function: {
      functionCode: {
        path: functionFileName,
        handler: "handlerExpired",
      },
      functionConfigs: {
        timeout: 2,
      },
    },
    dynamodb: {
      envVariableName: "IDEMPOTENCY_TABLE_NAME",
    },
  });

  await stack.deploy();
}, SETUP_TIMEOUT);

The benefit of this new utility are:

  • Centralized definition of test resources (i.e. all tests have same default settings for Lambda, DynamoDB, etc.)
  • Remove code duplication / maintenance overhead
  • Remove reliance on the internal CDK APIs

Future improvements

As mentioned, at the moment we are improving only the setup phase. In future PRs we could continue centralizing other parts of the integration tests like the invocation of the Lambda function in a test case, or the retrieval of CloudWatch logs for that test.

Give that each test case is now an object, we could add methods like:

const testCase = stack.getTestCase('default');
const logs = await testCase.invokeFunctionSequentially({ // returns the Cw logs of those 3 invocations
  payload: { ... },
  number: 3,
});

or:

const testCase = stack.getTestCase('default');
await testCase.invokeFunctionInParallel({
  payload: { ... },
  number: 3
});
const traces = await testCase.getTraces(); // retrieves all the traces of those 3 invocations

Related issues, RFCs

Issue number: #1623

Checklist

  • My changes meet the tenets criteria
  • I have performed a self-review of my own code
  • I have commented my code where necessary, particularly in areas that should be flagged with a TODO, or hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my change is effective and works
  • The PR title follows the conventional commit semantics

Breaking change checklist

Is it a breaking change?: NO

  • I have documented the migration process
  • I have added, implemented necessary warnings (if it can live side by side)

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Disclaimer: We value your time and bandwidth. As such, any pull requests created on non-triaged issues might not be successful.

@dreamorosi dreamorosi requested a review from a team July 24, 2023 16:00
@dreamorosi dreamorosi self-assigned this Jul 24, 2023
@dreamorosi dreamorosi linked an issue Jul 24, 2023 that may be closed by this pull request
2 tasks
@boring-cyborg boring-cyborg bot added the dependencies Changes that touch dependencies, e.g. Dependabot, etc. label Jul 24, 2023
@pull-request-size pull-request-size bot added the size/XXL PRs with 1K+ LOC, largely documentation related label Jul 24, 2023
@github-actions github-actions bot added the enhancement PRs that introduce minor changes, usually to existing features label Jul 24, 2023
@sonarqubecloud
Copy link

sonarqubecloud bot commented Jul 24, 2023

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
1.8% 1.8% Duplication

Copy link
Contributor

@am29d am29d left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR, Andrea! Finally the testing resources got their dedicated place outside of the commons package.

I see this PR as a great improvement step to the previous structure and usage. I still have few concerns I'd like to address, which might require a bigger refactoring that do not belong here.

Name overload
We overload resource names in many places, i.e. NodeJsFunction is our own wrapper, plus the actual CDK class, plus the factory that creates the CDK NodeJsFunction. Over time it will make our life harder to reason and distinguish about them, so we need a good separation in the name between our internal resources and the CDK classes.

Resource zoo
I notices that we currently support few resources, like DynamoDB table, NodeJsFunction and SSMParameters. They are all sitting in one class, wired through multiple if-statements so we know how to construct the TestCase. The list of resources will grow and with 10-20 different resources it'd require new restructure. There are multiple places in the code base that we need to change if would want to add support for SQS queue. In such cases it is often challenging to draw the line when to refactor, is it 5 or 8 or 10 resources that are "too much". My proposal would be to do it rather sooner than later, not in this PR but in the next iteration of the tests package. I don't have a particular design in mind, the driving force is primarily "what is the best setup so it'd be easier to add support for X in the future".

CDK integ lib
The examples you have highlighted in the PR description reminded me of the integ-test-alpha package of the CDK project. While our testing code was written before the cdk testing package existed, it seems like our implementation might converge into the same direction and it wouldn't be a surprise if we build a very similar assertion helper. Thus, we need to review cdk-integ-test-alpha package and see if and how much of the package we can use that'd make our life easier. It's a trade-off of course to reduce the code base for the cost of being squared into the library capabilities, and I think it's worth reviewing (again not here, in a separate PR).

import type { IStringParameter, StringParameter } from 'aws-cdk-lib/aws-ssm';
import { PolicyStatement, Effect } from 'aws-cdk-lib/aws-iam';

class NodeJsFunction {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name overload can be confusing, I see this class is internal, so it won't be exposed for maintainers when they write tests. Still, I'd propose to distinguish between our resource wrapper and the actual resource we hold.

}

public get envVariableName(): string {
return this.#envVariableName ?? 'TABLE_NAME';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move this check to the constructor?

*
* @default - No output
*/
logGroupOutputKey?: string;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth to also add the retention period with a sensible default, maybe a week or 5 days. What do you think?

@dreamorosi
Copy link
Contributor Author

Hey Alex, thanks for the review - I agree with the points you've made and I think it's important to get this right (or at least better than it is at the moment) now.

I'll move back the PR to draft status and try to address some the comments.

@dreamorosi dreamorosi marked this pull request as draft July 25, 2023 08:54
@dreamorosi
Copy link
Contributor Author

After discussing this further, me & @am29d have decided that we are going to take this back to the drawing board and consider working on a RFC before moving onto implementation.

If anyone stumbles into this & would like to contribute please get in touch.

@dreamorosi dreamorosi closed this Jul 25, 2023
@dreamorosi dreamorosi deleted the 1623-maintenance-adopt-aws-cdk-cli-library branch August 29, 2023 13:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dependencies Changes that touch dependencies, e.g. Dependabot, etc. enhancement PRs that introduce minor changes, usually to existing features size/XXL PRs with 1K+ LOC, largely documentation related
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants